Optimal Policies for Partially Observable Markov Decision Processes
Similar resources
A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems
Maintenance can either increase or decrease a system's availability, so it is valuable to evaluate a maintenance policy from both the cost and the availability points of view, simultaneously and according to the decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating syste...
Optimal Control for Partially Observable Markov Decision Processes over an Infinite Horizon
In this paper we consider an optimal control problem for partially observable Markov decision processes with finite states, signals and actions over an infinite horizon. It is shown that there are ε-optimal piecewise-linear value functions and piecewise-constant policies which are simple. Simple means that there are only finitely many pieces, each of which is defined on a convex polyhedral set...
Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes
It is well known that any finite state Markov decision process (MDP) has a deterministic memoryless policy that maximizes the discounted long-term expected reward. Hence for such MDPs the optimal control problem can be solved over the set of memoryless deterministic policies. In the case of partially observable Markov decision processes (POMDPs), where there is uncertainty about the world state,...
Solving POMDPs by Searching the Space of Finite Policies
Solving partially observable Markov decision processes (POMDPs) is highly intractable in general, at least in part because the optimal policy may be infinitely large. In this paper, we explore the problem of finding the optimal policy from a restricted set of policies, represented as finite state automata of a given size. This problem is also intractable, but we show that the complexity can be ...
Nonapproximability Results for Partially Observable Markov Decision Processes
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to or simply don't have guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP....